40 research outputs found

    BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images

    In cryo-electron microscopy (EM), molecular structures are determined from large numbers of projection images of individual particles. To harness the full power of this single-molecule information, we use the Bayesian inference of EM (BioEM) formalism. By ranking structural models using posterior probabilities calculated for individual images, BioEM in principle addresses the challenge of working with highly dynamic or heterogeneous systems not easily handled in traditional EM reconstruction. However, the calculation of these posteriors for large numbers of particles and models is computationally demanding. Here we present highly parallelized, GPU-accelerated computer software that performs this task efficiently. Our flexible formulation employs CUDA, OpenMP, and MPI parallelization combined with both CPU and GPU computing. The resulting BioEM software scales nearly ideally both on pure CPU and on CPU+GPU architectures, thus enabling Bayesian analysis of tens of thousands of images in a reasonable time. The general mathematical framework and robust algorithms are not limited to cryo-electron microscopy but can be generalized for electron tomography and other imaging experiments.
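    The idea of ranking structural models by per-image posterior probabilities can be illustrated with a minimal sketch. It is not the BioEM code: the function names, the uniform prior over a discrete set of orientations, and the toy log-likelihood values are all assumptions made for illustration.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def log_evidence(log_likes):
    """Marginalize one image over K sampled orientations.

    Assumes a uniform prior 1/K over the orientations (hypothetical choice).
    """
    return log_sum_exp(log_likes) - math.log(len(log_likes))

def rank_models(scores):
    """scores maps model name -> list of images, each a list of
    log-likelihoods (one per orientation). Images are treated as
    independent, so per-image log-evidences add up."""
    totals = {name: sum(log_evidence(image) for image in images)
              for name, images in scores.items()}
    ranking = sorted(totals, key=totals.get, reverse=True)
    return ranking, totals

# Toy numbers, invented for this example only.
toy_scores = {
    "model_A": [[-1.0, -2.0], [-1.5, -1.5]],
    "model_B": [[-3.0, -3.0], [-4.0, -2.5]],
}
ranking, totals = rank_models(toy_scores)
```

    Working in log space keeps the sums stable when individual likelihoods are very small, which is the usual situation for noisy images.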

    A massively parallel semi-Lagrangian solver for the six-dimensional Vlasov-Poisson equation

    This paper presents an optimized and scalable semi-Lagrangian solver for the Vlasov-Poisson system in six-dimensional phase space. Grid-based solvers of the Vlasov equation are known to give accurate results. At the same time, these solvers are challenged by the curse of dimensionality, which results in very high memory requirements and demands highly efficient parallelization schemes. In this paper, we consider the 6d Vlasov-Poisson problem discretized by a split-step semi-Lagrangian scheme, using successive 1d interpolations on 1d stripes of the 6d domain. Two parallelization paradigms are compared, a remapping scheme and a classical domain decomposition approach applied to the full 6d problem. From numerical experiments, the latter approach is found to be superior in the massively parallel case in various respects. We address the challenge of artificial time step restrictions due to the decomposition of the domain by introducing a blocked one-sided communication scheme for the purely electrostatic case and a rotating mesh for the case with a constant magnetic field. In addition, we propose a pipelining scheme that efficiently hides the cost of the halo communication between neighboring processes behind useful computation. Parallel scalability on up to 65k processes is demonstrated for benchmark problems on a supercomputer.
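    The notion of successive 1d interpolations on 1d stripes can be sketched in a few lines. This is an illustration only: it assumes a periodic grid and linear interpolation (the paper does not state its interpolation order here), and the names `advect_1d` and `advect_rows` are hypothetical.

```python
import math

def advect_1d(f, shift):
    """One semi-Lagrangian step on a periodic 1d stripe.

    shift = v * dt / dx is the displacement in grid cells. Each new value
    is read off at the foot of the characteristic, x_i - v * dt, using
    linear interpolation between the two neighboring grid points.
    """
    n = len(f)
    out = [0.0] * n
    for i in range(n):
        foot = i - shift
        left = math.floor(foot)   # grid index to the left of the foot
        w = foot - left           # fractional distance in [0, 1)
        out[i] = (1.0 - w) * f[left % n] + w * f[(left + 1) % n]
    return out

def advect_rows(grid, shifts):
    """Apply advect_1d to every row (stripe) with its own shift, as in the
    x-advection of a split step where each velocity has a different shift."""
    return [advect_1d(row, s) for row, s in zip(grid, shifts)]

shifted = advect_1d([1.0, 2.0, 3.0, 4.0], 1.0)     # whole-cell shift
blended = advect_1d([1.0, 2.0, 3.0, 4.0], 0.5)     # half-cell shift
stripes = advect_rows([[1.0, 2.0, 3.0, 4.0]] * 2, [0.0, 1.0])
```

    Because the shift may exceed one cell, the step is not bound by a CFL condition on a single process; in a domain-decomposed setting the halo width limits the shift, which is the artificial time step restriction the abstract refers to.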

    The MIGenAS integrated bioinformatics toolkit for web-based sequence analysis

    We describe a versatile and extensible integrated bioinformatics toolkit for the analysis of biological sequences over the Internet. The web portal offers convenient interactive access to a growing pool of chainable bioinformatics software tools and databases that are centrally installed and maintained by the RZG. Currently, supported tasks comprise sequence similarity searches in public or user-supplied databases, computation and validation of multiple sequence alignments, phylogenetic analysis and protein–structure prediction. Individual tools can be seamlessly chained into pipelines allowing the user to conveniently process complex workflows without the necessity to take care of any format conversions or tedious parsing of intermediate results. The toolkit is part of the Max-Planck Integrated Gene Analysis System (MIGenAS) of the Max Planck Society available at (click ‘Start Toolkit’)

    A Hybrid MPI-OpenMP Parallel Implementation for pseudospectral simulations with application to Taylor-Couette Flow

    A hybrid-parallel direct-numerical-simulation method with application to turbulent Taylor-Couette flow is presented. The Navier-Stokes equations are discretized in cylindrical coordinates with the spectral Fourier-Galerkin method in the axial and azimuthal directions, and high-order finite differences in the radial direction. Time is advanced by a second-order, semi-implicit projection scheme, which requires the solution of five Helmholtz/Poisson equations, avoids staggered grids and yields very small slip velocities. Nonlinear terms are computed with the pseudospectral method. The code is parallelized using a hybrid MPI-OpenMP strategy, which is simpler to implement, reduces inter-node communications and is more efficient compared to a flat MPI parallelization. A strong scaling study shows that the hybrid code maintains very good scalability up to more than 20000 processor cores and thus allows simulations at higher resolutions than previously feasible, and opens up the possibility of simulating turbulent Taylor-Couette flows at Reynolds numbers up to O(10^5). This makes it possible to probe hydrodynamic turbulence in Keplerian flows in experimentally relevant regimes. Comment: 30 pages, 11 figures
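    As a rough illustration of what computing a nonlinear term pseudospectrally means, the sketch below evaluates u * du/dx for a periodic 1d signal: the derivative is taken in Fourier space and the product in physical space. It is not the paper's code. It assumes a 1d periodic domain of length 2*pi, uses a naive O(n^2) DFT in place of an FFT, and applies no dealiasing; all function names are hypothetical.

```python
import cmath
import math

def dft(values, sign):
    """Naive discrete Fourier transform (sign=-1 forward, +1 unnormalized inverse)."""
    n = len(values)
    return [sum(values[m] * cmath.exp(sign * 2j * math.pi * m * k / n)
                for m in range(n))
            for k in range(n)]

def spectral_derivative(u):
    """Differentiate a periodic signal sampled at x_m = 2*pi*m/n."""
    n = len(u)
    u_hat = dft(u, -1)
    du_hat = []
    for k in range(n):
        wavenumber = k if k < n / 2 else k - n   # map to signed wavenumbers
        if 2 * k == n:
            wavenumber = 0                        # drop the Nyquist mode
        du_hat.append(1j * wavenumber * u_hat[k])
    return [(c / n).real for c in dft(du_hat, +1)]

def nonlinear_term(u):
    """u * du/dx: derivative in spectral space, product in physical space."""
    du = spectral_derivative(u)
    return [a * b for a, b in zip(u, du)]

n = 8
x = [2.0 * math.pi * m / n for m in range(n)]
u = [math.sin(xm) for xm in x]
result = nonlinear_term(u)
expected = [0.5 * math.sin(2.0 * xm) for xm in x]   # sin(x) * cos(x)
```

    Forming the product pointwise avoids the convolution that the same term would require in spectral space; production codes additionally remove aliasing errors, which this sketch omits.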

    An Efficient Particle Tracking Algorithm for Large-Scale Parallel Pseudo-Spectral Simulations of Turbulence

    Particle tracking in large-scale numerical simulations of turbulent flows presents one of the major bottlenecks in parallel performance and scaling efficiency. Here, we describe a particle tracking algorithm for large-scale parallel pseudo-spectral simulations of turbulence which scales well up to billions of tracer particles on modern high-performance computing architectures. We summarize the standard parallel methods used to solve the fluid equations in our hybrid MPI/OpenMP implementation. As the main focus, we describe the implementation of the particle tracking algorithm and document its computational performance. To address the extensive inter-process communication required by particle tracking, we introduce a task-based approach to overlap point-to-point communications with computations, thereby enabling improved resource utilization. We characterize the computational cost as a function of the number of particles tracked and compare it with the flow field computation, showing that the cost of particle tracking is very small for typical applications.

    Spherically Symmetric Simulation with Boltzmann Neutrino Transport of Core Collapse and Post-Bounce Evolution of a 15 Solar Mass Star

    We present a spherically symmetric, Newtonian core-collapse simulation of a 15 solar mass star with a 1.28 solar mass iron core. The time-, energy-, and angle-dependent transport of electron neutrinos and antineutrinos was treated with a new code which iteratively solves the Boltzmann equation and the equations for neutrino number, energy and momentum to order O(v/c) in the velocity v of the stellar medium. The supernova shock expands to a maximum radius of 350 km instead of only about 240 km as in a comparable calculation with multi-group flux-limited diffusion (MGFLD) by Bruenn, Mezzacappa, & Dineva (1995). This may be explained by stronger neutrino heating due to the more accurate transport in our model. Nevertheless, after 180 ms of expansion the shock finally recedes to a radius around 250 km (compared to about 170 km in the MGFLD run). The effect of an accurate neutrino transport is helpful, but not large enough to cause an explosion of the considered 15 solar mass star. Therefore postshock convection and/or an enhancement of the core neutrino luminosity by convection or reduced neutrino opacities in the neutron star seem necessary for neutrino-driven explosions of such stars. We find an electron fraction Y_e > 0.5 in the neutrino-heated matter, which suggests that the overproduction problem of neutron-rich nuclei with mass numbers around A = 90 in exploding models may be absent when a Boltzmann solver is used for the electron neutrino and antineutrino transport. Comment: 6 pages, LaTeX, 3 encapsulated postscript figures, revised and shortened version. Astrophys. J., Letters, accepted

    Electron inertia effects in 3D hybrid-kinetic collisionless plasma turbulence

    The effects of the electron inertia on the current sheets that are formed out of kinetic turbulence are relevant to understanding the importance of coherent structures in turbulence and the nature of turbulence at the dissipation scales. We investigate this problem by carrying out 3D hybrid-kinetic Particle-in-Cell (PIC) simulations of decaying kinetic turbulence with our CHIEF code. The main distinguishing feature of this code is an implementation of the electron inertia without approximations. Our simulation results show that the electron inertia plays an important role in regulating and limiting the largest values of current density in both real and wavenumber Fourier space, in particular near and, unexpectedly, even above electron scales. In addition, the electric field associated with the electron inertia dominates most of the strongest current sheets. The electron inertia is thus important to accurately describe the properties of current sheets formed in turbulence at electron scales. Comment: 34 pages, 10 figures. Revised version. Published in Physics of Plasmas